feat: verda cloud (gpu/ai) provider support #142
Merged
rafeegnash merged 23 commits into master on Apr 20, 2026
added 23 commits
April 20, 2026 01:40
- release mutex during oauth token fetch to avoid deadlock risk
- honor context cancellation during retry backoffs
- proactively bound polling sleep to remaining deadline
- drop `offline` from terminal instance status so transient stops don't end polling early
- atomic rename on conversation history save so a crash mid-write can't corrupt it
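The cancellation-aware backoff and the deadline-bounded polling sleep from this commit could look roughly like the sketch below. `sleepCtx` is named in this PR's test notes, but its signature here is an assumption, and `clampToRemaining` and `alreadyCancelled` are hypothetical helpers added for illustration only.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sleepCtx waits for d or until ctx is cancelled, whichever comes first.
// Assumed signature; only the name appears in the PR.
func sleepCtx(ctx context.Context, d time.Duration) error {
	if d <= 0 {
		// Zero-duration path: no timer, just report any cancellation.
		return ctx.Err()
	}
	t := time.NewTimer(d)
	defer t.Stop()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-t.C:
		return nil
	}
}

// clampToRemaining (hypothetical) bounds a polling interval to the time
// left before the caller's deadline, so a poll never oversleeps it.
func clampToRemaining(d, remaining time.Duration) time.Duration {
	if remaining <= 0 {
		return 0
	}
	if d > remaining {
		return remaining
	}
	return d
}

// alreadyCancelled returns a context cancelled before use (demo helper).
func alreadyCancelled() context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx
}

func main() {
	// Returns at once with the context's error instead of sleeping for an hour.
	fmt.Println(sleepCtx(alreadyCancelled(), time.Hour))
	// A 30s poll interval with 2s left before the deadline is cut to 2s.
	fmt.Println(clampToRemaining(30*time.Second, 2*time.Second))
}
```

A retry loop would call `sleepCtx` between attempts and return its error as soon as it is non-nil, which is what lets ctrl-c interrupt a long backoff.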
- pass remote path to ssh as a separate argv token and reject shell metacharacters so a dynamic candidate list cannot inject commands
- add BatchMode=yes to ssh so kubeconfig reads fail fast on missing keys instead of prompting for a password
- lowercase uuid inputs before short-circuiting hostname lookup so pasted uppercase ids resolve correctly
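As a rough illustration of the ssh hardening above, the sketch below builds the argv for a remote file read. The function name `buildSSHArgs`, the use of `cat`, and the exact rejected character set are assumptions, not the merged code. The character check still matters even with a separate argv token, because ssh joins the remote command arguments and hands them to the remote shell.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// shellMeta lists characters rejected in remote paths.
// The exact set is an assumption made for this sketch.
const shellMeta = ";&|`$<>(){}[]*?!\\\"' \t\n\r"

// buildSSHArgs (hypothetical) returns the ssh argv for reading remotePath
// on target. The path is its own argv token and is never spliced into a
// local shell string.
func buildSSHArgs(target, remotePath string) ([]string, error) {
	if remotePath == "" || strings.HasPrefix(remotePath, "-") {
		return nil, errors.New("invalid remote path")
	}
	if strings.ContainsAny(remotePath, shellMeta) {
		return nil, fmt.Errorf("remote path %q contains shell metacharacters", remotePath)
	}
	return []string{
		"-o", "BatchMode=yes", // fail fast instead of prompting for a password
		target,
		"cat", remotePath,
	}, nil
}

func main() {
	args, err := buildSSHArgs("root@203.0.113.10", "/root/.kube/config")
	fmt.Println(args, err)

	_, err = buildSSHArgs("root@203.0.113.10", "/root/x; rm -rf /")
	fmt.Println(err)
}
```

The returned slice would be passed to `exec.CommandContext(ctx, "ssh", args...)` so that cancellation also applies to the ssh child process.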
- resolve hostname to instance id in verda action so users can pass names instead of uuids
- add container-deployments, job-deployments, container-types, secrets, file-secrets, registry-creds, and balance to verda list
- gather container and job deployment context for ask-mode prompts
- propagate cmd.Context to handleVerdaQuery so ctrl-c cancels in-flight api calls
- check json.Marshal error on action payload instead of discarding it
- uuid parsing, scp target splitting, kubeconfig server rewrite, and the nil-client guard now have coverage in internal/k8s/cluster
- ResolveInstanceID verifies hostname lookup, uuid short-circuit, uppercase normalisation, and unknown-name error path
- sleepCtx exercises both cancellation and zero-duration paths
- routing tests confirm verda keyword hits, datacrunch alias, default-provider fallback, no-false-positive on bare 'gpu', and LLM classification clearing the right providers
- tighten looksLikeUUID to lowercase-only since resolveClusterID already lowercases, matching the verda package mirror
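A lowercase-only `looksLikeUUID` paired with normalisation at the call site might be sketched as follows. The function name comes from this commit; the body and the `normaliseID` helper are assumptions about how such a check could be written.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeUUID reports whether s is a canonical 8-4-4-4-12 UUID written
// in lowercase hex. Uppercase input is rejected on purpose: callers are
// expected to lowercase first.
func looksLikeUUID(s string) bool {
	if len(s) != 36 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch i {
		case 8, 13, 18, 23:
			if c != '-' {
				return false
			}
		default:
			isDigit := c >= '0' && c <= '9'
			isLowerHex := c >= 'a' && c <= 'f'
			if !isDigit && !isLowerHex {
				return false
			}
		}
	}
	return true
}

// normaliseID (hypothetical) is what a resolver would apply to pasted input
// before deciding whether to skip the hostname lookup.
func normaliseID(s string) string {
	return strings.ToLower(strings.TrimSpace(s))
}

func main() {
	pasted := " 123E4567-E89B-12D3-A456-426614174000 "
	fmt.Println(looksLikeUUID(pasted))              // raw input is not accepted
	fmt.Println(looksLikeUUID(normaliseID(pasted))) // normalised input is
}
```

Keeping the predicate strict and doing the lowercasing in one place avoids two code paths disagreeing about what counts as an id.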
cloudflare, k8s, gcp, azure, digitalocean, hetzner, aws, and iam branches all leaked ctx.Verda through when the llm picked a different provider. a keyword-inferred verda signal paired with an llm-picked aws query would surface both flags and downstream code would run both paths. add a parametric regression test so every sibling provider gets checked.
- mcp `clanker_verda_list` switch gains containers, jobs, secrets, file-secrets, and registry-creds, matching the cli surface added in the previous commit
- error on unknown resource now lists every supported value so users don't need to open docs
- `clanker verda balance` prints a human-readable `Balance: $42.17 USD` line in addition to the json body, skipped with --raw for scriptability
- mcp input schema description updated so agents that read the schema know the full enum
pickKubernetesClusterImage used to fall back silently to the default cluster image when no image carried a kubernetes/k8s hint. the cluster would then provision without k8s installed, and GetKubeconfig would later fail opaquely looking for /root/.kube/config. return the label-match as a second boolean so Create can emit a visible warning with the image id in the log, giving the caller a chance to abort or inject a startup script.
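The two-value return described here might be sketched like this. `pickKubernetesClusterImage` is the name from the commit; the `Image` type, the parameter list, and the warning text are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Image is a hypothetical stand-in for the provider's cluster image type.
type Image struct {
	ID   string
	Name string
}

// pickKubernetesClusterImage returns the first image whose name carries a
// kubernetes/k8s hint. When none does, it returns the default id and
// matched=false so the caller can warn instead of falling back silently.
func pickKubernetesClusterImage(images []Image, defaultID string) (id string, matched bool) {
	for _, img := range images {
		name := strings.ToLower(img.Name)
		if strings.Contains(name, "kubernetes") || strings.Contains(name, "k8s") {
			return img.ID, true
		}
	}
	return defaultID, false
}

func main() {
	images := []Image{{ID: "img-1", Name: "Ubuntu 24.04"}}
	id, matched := pickKubernetesClusterImage(images, "img-default")
	if !matched {
		// The caller decides what to do: abort, or inject a startup script.
		fmt.Printf("warning: no kubernetes-labelled image found, using %s\n", id)
	}
}
```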
previously any path that reached RunVerdaCLI* surfaced a generic exec failure like "exec: not found". with this change:
- new typed sentinel ErrCLINotInstalled + IsCLINotInstalled helper so callers can branch without string matching. the error message now points at the docs + brew install + REST fallback rather than a bare "not found"
- CLIInstalled() preflight lets callers show cli-aware ux eagerly instead of waiting for a real exec error
- resolveVerdaCredentials suggests `verda auth login` only when the binary is actually installed; otherwise it drops that line so the user isn't steered toward a command they can't run
- k8s agent auto-registration now logs a specific reason when the provider is skipped (no creds / partial creds / NewClient failed) at debug level so users investigating "verda-instant missing from cluster types" get a breadcrumb

tests cover ErrCLINotInstalled returning from a stubbed empty PATH, the wrapped error message mentioning docs.verda.com and the REST fallback, and CLIInstalled returning false on an empty PATH.
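The sentinel-plus-helper pattern from this commit can be sketched with the standard `errors` and `os/exec` packages. `ErrCLINotInstalled`, `IsCLINotInstalled`, and `CLIInstalled` are names from the commit message; the bodies, the message text, and the `wrapExecErr` and `lookup` helpers are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// ErrCLINotInstalled lets callers branch without matching on error strings.
var ErrCLINotInstalled = errors.New("verda cli not installed")

// IsCLINotInstalled reports whether err wraps ErrCLINotInstalled.
func IsCLINotInstalled(err error) bool {
	return errors.Is(err, ErrCLINotInstalled)
}

// CLIInstalled is a preflight check: is a binary named "verda" on PATH?
func CLIInstalled() bool {
	_, err := exec.LookPath("verda")
	return err == nil
}

// wrapExecErr (hypothetical) converts the generic "not found" failure into
// the typed sentinel with a pointer to next steps. Message text is assumed.
func wrapExecErr(err error) error {
	if errors.Is(err, exec.ErrNotFound) {
		return fmt.Errorf("%w: see docs.verda.com, or use the REST fallback", ErrCLINotInstalled)
	}
	return err
}

// lookup (demo helper) resolves a binary and wraps any failure.
func lookup(name string) error {
	_, err := exec.LookPath(name)
	return wrapExecErr(err)
}

func main() {
	err := lookup("definitely-not-a-real-binary-xyz")
	if IsCLINotInstalled(err) {
		fmt.Println("cli missing:", err)
	}
	fmt.Println("verda on PATH:", CLIInstalled())
}
```

Because the sentinel is wrapped with `%w`, callers further up the stack can add their own context and `IsCLINotInstalled` keeps working.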
two concurrent `clanker ask --verda` invocations for the same project could race on the tmp-file + rename, dropping one conversation's history without returning an error. add a package-level map of per-scope sync.Mutex so both Save and Load take the same lock before touching the file — write barrier first, then the existing struct RWMutex for in-memory state. tests cover the save/load round trip, a 20-goroutine storm against the same scope that asserts the final file parses as valid json, and a leak check that confirms only the final file remains (no stray verda_*.tmp). go test -race is clean.
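A minimal sketch of the per-scope lock combined with the tmp-file + rename save is shown below. All names (`lockFor`, `saveHistory`, `runStorm`, the `proj.json` file name) are hypothetical, and the load path and in-memory RWMutex are left out. `runStorm` mirrors the goroutine storm described in the test notes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

var (
	scopeMu    sync.Mutex                   // guards the map itself
	scopeLocks = map[string]*sync.Mutex{}   // one lock per scope
)

// lockFor returns the mutex for a scope, creating it on first use.
func lockFor(scope string) *sync.Mutex {
	scopeMu.Lock()
	defer scopeMu.Unlock()
	m, ok := scopeLocks[scope]
	if !ok {
		m = &sync.Mutex{}
		scopeLocks[scope] = m
	}
	return m
}

// saveHistory writes to a temp file in dir, then renames it into place
// while holding the scope lock, so readers never see a partial file.
func saveHistory(dir, scope string, history []string) error {
	mu := lockFor(scope)
	mu.Lock()
	defer mu.Unlock()

	data, err := json.Marshal(history)
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(dir, "verda_*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), filepath.Join(dir, scope+".json"))
}

// runStorm saves from n goroutines, then checks that the result parses as
// JSON and returns how many files remain in the directory.
func runStorm(n int) (int, error) {
	dir, err := os.MkdirTemp("", "history")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = saveHistory(dir, "proj", []string{fmt.Sprintf("msg %d", i)})
		}(i)
	}
	wg.Wait()
	data, err := os.ReadFile(filepath.Join(dir, "proj.json"))
	if err != nil {
		return 0, err
	}
	var got []string
	if err := json.Unmarshal(data, &got); err != nil {
		return 0, err
	}
	entries, err := os.ReadDir(dir)
	return len(entries), err
}

func main() {
	files, err := runStorm(20)
	fmt.Println(files, err)
}
```

Note that an in-process mutex only serialises goroutines within one process; the rename is what keeps the file itself intact if separate processes write at the same time.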
matches the pattern resolveHetznerToken / resolveVercelToken already use. when local resolution (config → env → ~/.verda/credentials) comes up empty and a clanker backend api key is configured, we now try the backend credential store. a 404 from the backend (because the server-side route may not be provisioned yet) gracefully falls through to the existing human-readable "not configured" error.
- new VerdaCredentials type + ProviderVerda constant in internal/backend/types.go
- GetVerdaCredentials + StoreVerdaCredentials client methods following the existing Vercel/Hetzner shape so `clanker credentials store verda ...` can be added server-side later without client updates
- resolveVerdaCredentialsWithContext keeps the sync-safe resolveVerdaCredentials wrapper for non-ctx callers, but handleVerdaQuery now uses the ctx variant so cancellation during the backend round-trip is honoured
the ask-mode backend fallback we just added reads from the clanker backend credential store, but users had no way to push verda creds to it locally. this closes that loop.
- storeVerdaCredentials reads client-id/secret/project-id from --flag, then verda.Resolve* chain (viper → env → ~/.verda/credentials) and uploads to PUT /api/v1/cli/credentials/verda via the new backend client method
- --client-id / --client-secret / --project-id flags added to the store command with the other provider flags
- help text and usage examples updated to include verda
- friendly error when both credential fields are missing points at every configuration path (flags, yaml, env, `verda auth login`)
`clanker credentials test verda` pulls the stored client_id/secret from the clanker backend, spins up a verda.Client, and hits /v1/balance as the cheapest authenticated probe. debug mode prints the returned balance.

`clanker credentials delete verda` was accepting "verda" as a provider string via the store path but never reached delete; runCredentialsDelete now routes ProviderVerda.
VerdaPlanPromptWithMode instructs the llm to emit a rest-first plan: args are [verda-api, METHOD, /v1/path, body?] instead of a shell command. this keeps maker execution independent of the verda cli (which may not be installed) and reuses the well-understood verda oauth2 + retry client. prompt covers the verda api's most common flows: list, create instance, lifecycle actions (start/shutdown/delete/hibernate), create volume, attach volume, create ssh key, create startup script, create instant cluster with kubernetes image, discontinue cluster, check balance, and enumerate instance-types. documents the binding format so the planner can chain commands (INSTANCE_ID -> next command).
ExecuteVerdaPlan dispatches [verda-api, METHOD, /v1/path, body?] commands directly through verda.Client, with no shell-out and no cli dependency. the existing oauth2 token caching, 429 backoff, and typed error decoding all apply unchanged.

plan bindings (<PLACEHOLDER>) and `produces` jsonpath capture are wired through applyPlanBindings + learnPlanBindingsFromProduces so multi-step plans (create instance -> start instance) compose cleanly.

validateVerdaCommand enforces:
- exactly 3-4 args [verda-api, METHOD, path, body?]
- method in GET/POST/PUT/PATCH/DELETE
- path prefix /v1/
- no newlines in any arg
- destructive operations (DELETE, action=delete|discontinue|force_shutdown|delete_stuck|hibernate) gated behind --destroyer

ExecOptions gains VerdaClientID/VerdaClientSecret/VerdaProjectID fields. internal/verda exposes SetBaseURLForTest so the maker's test file can redirect the executor at an httptest server without touching production.

tests cover shape validation, destructive gating (delete/discontinue pass only with --destroyer; start passes always), and end-to-end execution with a real httptest server, verifying oauth2 cache reuse (one token call across two api calls) and jsonpath binding substitution from one command into the next.
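The validateVerdaCommand rules listed in this commit can be sketched as one function. The signature, the error texts, and the assumption that the action travels as a top-level `"action"` field in a JSON body are all illustrative; the merged code may detect destructive actions differently.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

var allowedMethods = map[string]bool{
	"GET": true, "POST": true, "PUT": true, "PATCH": true, "DELETE": true,
}

var destructiveActions = map[string]bool{
	"delete": true, "discontinue": true, "force_shutdown": true,
	"delete_stuck": true, "hibernate": true,
}

// validateVerdaCommand checks shape, method, path prefix, and newlines,
// then gates destructive operations behind the destroyer flag.
func validateVerdaCommand(args []string, destroyer bool) error {
	if len(args) < 3 || len(args) > 4 {
		return errors.New("want [verda-api, METHOD, path, body?]")
	}
	if args[0] != "verda-api" {
		return fmt.Errorf("unexpected command %q", args[0])
	}
	for _, a := range args {
		if strings.ContainsAny(a, "\r\n") {
			return errors.New("newlines are not allowed in arguments")
		}
	}
	method, path := args[1], args[2]
	if !allowedMethods[method] {
		return fmt.Errorf("unsupported method %q", method)
	}
	if !strings.HasPrefix(path, "/v1/") {
		return fmt.Errorf("path %q must start with /v1/", path)
	}
	if destroyer {
		return nil
	}
	if method == "DELETE" {
		return errors.New("DELETE requires --destroyer")
	}
	if len(args) == 4 {
		// Assumption: the action is a top-level "action" field in the body.
		var body struct {
			Action string `json:"action"`
		}
		if err := json.Unmarshal([]byte(args[3]), &body); err == nil && destructiveActions[body.Action] {
			return fmt.Errorf("action %q requires --destroyer", body.Action)
		}
	}
	return nil
}

func main() {
	cmd := []string{"verda-api", "PUT", "/v1/instances", `{"action":"delete","id":"<INSTANCE_ID>"}`}
	fmt.Println(validateVerdaCommand(cmd, false)) // rejected without --destroyer
	fmt.Println(validateVerdaCommand(cmd, true))  // allowed with it
}
```

Validating before any network call means a malformed or destructive plan step fails without spending an oauth token fetch.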
ask.go now routes --maker --verda through the full plan + apply cycle:
- drop the hard-coded "not yet supported for verda" guard
- explicitVerda selects makerProvider="verda" (explicit reason)
- svcCtx.Verda inference selects makerProvider="verda" (inferred reason)
- maker prompt switch calls maker.VerdaPlanPromptWithMode
- verda is included in the read-only "output only" provider list so
--maker (without --apply) prints the plan without trying to run it
through the aws enrichment pipeline
- --apply path: resolve verda credentials (backend-fallback-enabled) and
call maker.ExecuteVerdaPlan with Client{ID,Secret} + ProjectID threaded
through ExecOptions
end-to-end: `clanker ask --maker --verda "create one h100 in FIN-01"`
emits a json plan; `clanker ask --maker --verda --apply < plan.json` runs
the plan's verda-api commands in order, respecting `produces` bindings
and the --destroyer gate for delete/discontinue/force_shutdown actions.
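How `produces` bindings could carry a value from one plan step into the next is sketched below. This is a simplified illustration only: `applyPlanBindings` is a name from the earlier commit but its signature here is assumed, `learnBinding` is hypothetical, only flat `$.field` paths are handled rather than full jsonpath, and the request body shown is made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// placeholderRe matches tokens such as <INSTANCE_ID>.
var placeholderRe = regexp.MustCompile(`<([A-Z][A-Z0-9_]*)>`)

// applyPlanBindings substitutes known placeholders in every argument and
// leaves unknown ones untouched.
func applyPlanBindings(args []string, bindings map[string]string) []string {
	out := make([]string, len(args))
	for i, a := range args {
		out[i] = placeholderRe.ReplaceAllStringFunc(a, func(tok string) string {
			if v, ok := bindings[tok[1:len(tok)-1]]; ok {
				return v
			}
			return tok
		})
	}
	return out
}

// learnBinding (hypothetical) captures a string field from a JSON response.
// Only "$.field" paths are supported in this sketch.
func learnBinding(bindings map[string]string, name, path string, response []byte) error {
	var obj map[string]any
	if err := json.Unmarshal(response, &obj); err != nil {
		return err
	}
	v, ok := obj[strings.TrimPrefix(path, "$.")].(string)
	if !ok {
		return fmt.Errorf("no string value at %s", path)
	}
	bindings[name] = v
	return nil
}

func main() {
	bindings := map[string]string{}
	// Step 1 response: remember the new instance id.
	if err := learnBinding(bindings, "INSTANCE_ID", "$.id", []byte(`{"id":"abc-123"}`)); err != nil {
		fmt.Println(err)
		return
	}
	// Step 2: the placeholder is resolved before the command runs.
	next := []string{"verda-api", "PUT", "/v1/instances", `{"id":"<INSTANCE_ID>","action":"start"}`}
	fmt.Println(applyPlanBindings(next, bindings))
}
```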
Summary
- `clanker verda` command tree (list/get/action/balance + `verda ask`) plus `clanker ask --verda` with keyword routing.
- `verda-instant` Kubernetes cluster provider so `clanker k8s` can provision and pull kubeconfigs from Verda Instant Clusters.
- `clanker_verda_ask` / `clanker_verda_list` over MCP.

Details
- `https://api.verda.com/v1`, with `expires_in`-driven refresh, 429 / `Retry-After`, 207 multi-status and `{code,message}` error decoding.
- `~/.clanker.yaml` → `VERDA_*` env → `~/.verda/credentials` (written by `verda auth login`).
- `server:` URL to the public IP.

Test plan
- `make fmt vet test-short build`
- `./bin/clanker verda --help` shows the new tree
- `./bin/clanker ask --help` lists `--verda`
- `./bin/clanker verda list instances` against a real Verda account
- `./bin/clanker verda ask "what's my balance and running GPUs?"` against a real account
- `./bin/clanker mcp --transport http --listen :39393` exposes `clanker_verda_*`